Abstract: Sentiment Analysis (SA) and summarization have recently become the focus of many researchers, because analysis of online text is useful and in demand across many applications. One such application is product-based sentiment summarization of multiple documents, which aims to inform users about the pros and cons of various products. This paper introduces a novel solution to target-oriented sentiment summarization and SA of short informal texts, with a main focus on Twitter posts known as “tweets”. We compare different algorithms and methods for SA polarity detection and sentiment summarization. We show that our hybrid polarity detection system not only outperforms the state-of-the-art unigram baseline, but can also offer an advantage over other methods when used as part of a sentiment summarization system. Additionally, we illustrate that our SA and summarization system exhibits high performance and provides various useful functionalities and features. Sentiment classification aims to automatically predict the sentiment polarity (e.g., positive or negative) of sentiment data published by users (e.g., reviews, blogs). Although traditional classification algorithms can be used to train sentiment classifiers from manually labeled text data, the labeling work can be time-consuming and expensive. Meanwhile, users often use different words when they express sentiment in different domains. If we directly apply a classifier trained in one domain to other domains, performance will be very low because of the differences between these domains. In this work, we develop a general solution to sentiment classification for the case where we have no labels in a target domain but do have some labeled data in a different domain, regarded as the source domain.
Keywords: Sentiment analysis, labeled data, sentiment polarity, sentiment classification.